Asynchronous remote data copying.

Asynchronous remote data duplexing at a distant location is
performed from copies based at a primary site storage
subsystem using first and second pluralities of subsystems
12, 14 at primary and remote sites respectively. Each of the
first plurality of subsystems is independently coupled to
one or more of the second plurality of subsystems. The first
plurality of subsystems are interconnected and the second
plurality of subsystems are also interconnected. The method
utilizes checkpoint messages to maintain sequence integrity
between the first and second plurality of subsystems without
the use of a centralized communications service.

The present invention relates to asynchronous remote data
copying or duplexing to provide data preservation in an
information handling system, whereby information from a
primary site storage subsystem can be copied to a remote
location.

Data copying is one form of data preservation in an
information handling or computer system. However, data
preservation via data copying must take many factors into
account. This is of special significance where it is
anticipated that data copied and stored at a remote site
would be the repository for any continued interaction with
the data should the work and data of a primary site become
unavailable. The factors of interest in copying include the
protection domain (system and/or environmental failure or
device and/or media failure), data loss (no loss/partial
loss), time where copying occurs as related to the
occurrence of other data and processes (point in time/real
time), the degree of disruption to applications executing on
said computer, and whether the copy is application or
storage system based. With regard to the last factor,
application based copying involves log files, data files,
and program routines while storage based copying involves an
understanding of direct access storage device (DASD)
addresses with no knowledge of data types or application use
of the data.

Real-time remote data duplexing systems require some means
to ensure update sequence integrity as write updates are
applied to the secondary or remote DASD data copy. One way
to accomplish this is to provide a synchronous system to
control the DASD subsystems. In such a system, the primary
DASD write operation does not complete until a copy of that
data has been confirmed at a secondary location. The problem
with such synchronous systems is that they slow down the
overall operation of the duplexing system.

Asynchronous copy systems accomplish sequence integrity
through a centralization and consolidation of data
communications between primary and secondary DASD subsystems
through a central communications system. In such systems, a
system at the primary site can determine the sequence among
different update write operations among all DASD subsystems
at the primary site and communicate that information to the
DASD subsystem at the remote site. The secondary subsystem
in turn uses the sequence information from the primary to
control the application of update data to the secondary DASD
data copy. Known asynchronous copy systems that utilize
centralized data communications are described below.

McIlvain and Shomler, U.S. patent application number
08/036,017 entitled "Method and Means for Multi-System
Remote Data Duplexing and Recovery" (EPA 617362) describes
the use of a store and forward message interface at the DASD
storage management level between a source of update copies
and a remote site in a host to host coupling in which the
difference in update completeness or loss of the sequence of
write updates could be completely specified in the event of
interruption.

Cheffetz, et al., U.S. Patent 5,133,065 entitled "Backup
Computer Program for Networks" issued July 21, 1992,
discloses a local area network (LAN) having a file server to
which each local node creates and transmits a list of local
files to be backed-up. Such remote generation reduces the
traffic where a network server initiates the list creation
and file copying activity. Arguably, art published before
this reference taught centrally administered file selection.
This resulted in compromises to local node security and
overuse of the server. This is presumptively avoided by
Cheffetz's local node generated lists and remission of the
lists to the file server.

Beale, et al., U.S. Patent 5,155,845 entitled "Data Storage
System for Providing Redundant Copies of Data on Different
Disk Drives", copies variable length records (CKD) on two or
more external stores by causing a write to be processed by
the first storage controller and be communicated in parallel
over a direct link (broad band path) to the second storage
controller, obviating the path length limitation between the
primary and remote copy sites. Such a limitation is
occasioned by the fact that CKD demand/response architecture
is length limited to in the range of 150 meters.

Another example of an asynchronous system that utilizes a
centralized system is disclosed in the U.S. patent
application entitled "Remote Data Duplexing Asynchronous
Information Packet Message", by Micka et al. (EPA 602822).
This application discloses a system for asynchronously
duplexing direct access storage device (DASD) data in a
plurality of DASD subsystems and has the advantage of
decoupling the data duplexing operation from the DASD write
I/O operation. This ensures the write does not incur
unnecessary wait states in the subsystem. This decoupling
and independent operation is achieved by establishing a
sequence checkpoint at which time a set of information
packets are grouped together and processed as a single
sequence unit. Through this independence, data copying to a
secondary location can take place without affecting the
performance of the subsystems and without affecting the
corresponding integrity of the data that is being updated.

There are systems in use in which there is no centralized
communication service between the primary and secondary
locations. Oftentimes in such system configurations each
primary subsystem has a direct independent link to a
selected secondary subsystem. In such a system, the need
for sequence-consistent asynchronous write operations across
the subsystems cannot be addressed utilizing known
asynchronous copy schemes.

Accordingly, the invention provides a method of operating an
asynchronous remote data copy system including a primary
site having a first plurality of subsystems interconnected
by a first coupling means, and a secondary site remote from
the primary site having a second plurality of subsystems
interconnected by a second coupling means, each of the
second plurality of subsystems being independently coupled
to a counterpart one of the first plurality of subsystems,
said method comprising the steps of:

in the first plurality of subsystems:

sending a checkpoint signal to each of the first plurality
of subsystems;

sending updated data and the checkpoint signal to each of
the counterpart coupled second plurality of subsystems; and

in the second plurality of subsystems:

receiving the updated data and checkpoint signals; and

coordinating the writing of the updated data based upon the
checkpoint signals.

Such a method provides independent links between primary and
secondary subsystems, with no central communications system,
and provides for sequence consistent remote copying from one
set of multiple subsystems at one location to a second set
of multiple subsystems at a second location. The
asynchronous copy system is simple, cost effective and does
not significantly impede the overall operation of the
subsystems.

In a preferred embodiment the method further comprises the
step of distributing a sequence signal at the primary site,
said checkpoint signal having a predetermined relationship
with the sequence signal. Such a sequence signal is used to
ensure synchronisation between the various subsystems.
However, if the network can ensure quick and reliable
distribution of the checkpoint messages, then in some
situations it is possible to dispense with the sequence
signal.

In the preferred embodiment, the method further comprises
the steps of: activating each of the first plurality of
subsystems to communicate with the other subsystems in said
first plurality of subsystems; activating each of the first
plurality of subsystems to communicate with its counterpart
coupled subsystem in the second plurality of subsystems;
building at least one configuration table in each of the
first plurality of subsystems such that each of the first
plurality of subsystems can identify all of the other
subsystems in said first plurality of subsystems; and
synchronizing the first plurality of subsystems; the
coordinating step further comprises the steps of: receiving
copy active messages in the second plurality of subsystems;
building copy active tables in the second plurality of
subsystems; synchronizing copy operations of the second
plurality of subsystems; receiving the checkpoint messages
in the second plurality of subsystems; performing a
rendezvous for all checkpoint messages in the second
plurality of subsystems; and applying the updated data at
the second plurality of subsystems.

The method of the preferred embodiment further comprises the
steps of: causing each of said first plurality of subsystems
to asynchronously generate a sequence of updates; ordering
each of said sequences in accordance with the received
sequence signal and checkpoint signal; asynchronously
communicating sequences of updates from each of the first
plurality of subsystems into a buffered portion of a
counterpart subsystem; and applying the buffered sequences
at the second plurality of subsystems as a function of each
checkpoint message.

The invention additionally provides a method of operating an
asynchronous remote data copy system including a primary
site having a first plurality of subsystems interconnected
by a first coupling means, and a secondary site remote from
the primary site having a second plurality of subsystems
interconnected by a second coupling means, each of the
second plurality of subsystems being independently coupled
to a counterpart one of the first plurality of subsystems,
said method comprising the steps of:

(a) at the primary site, responsive to initiation of a start
copy operation, ascertaining at each subsystem the subset of
the plurality of DASD subsystems forming the copyset group,
and designating one of the plurality of subsystems as a
clocking and checkpoint message source, each checkpoint
message including a sequence clock value and an increased
sequence number;

(b) at the secondary site, repeating step (a) for
counterpart ones of the DASD subsystems;

(c) at the primary site, periodically generating clocking
signals and checkpoint messages by said designated subsystem
and broadcasting said signals and messages as they occur to
other subsystems forming the copyset group including itself,
and, at each subsystem in the copyset group,

(1) asynchronously forming a local sequence of updated
records,

(2) embedding said signals and messages into the sequence to
form a time discriminated total ordering of updated records,
and

(3) remitting at least a portion of the sequence to a buffer
portion of the counterpart DASD subsystem at the secondary
site; and

(d) at the secondary site, applying a checkpoint message to
the designated subsystem operative as a synchronizing source
by each subsystem in the copyset group and writing the
sequences or portions thereof stored in the buffered
portions of said subsystems to DASD only upon a signal from
said designated subsystem indicative of its receipt of all
send messages.

Each checkpoint message signifies that all DASD records with
a clock signal with a sequence number that is less than the
checkpoint message sequence clock value have been
transmitted to the counterpart secondary subsystem.
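
As an illustration only, the checkpoint message and the guarantee it carries can be sketched as below. The class and function names (`UpdateRecord`, `CheckpointMessage`, `covered_by`, `next_checkpoint`) and the use of integer clock values are assumptions made for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    clock_value: int   # sequence clock value stamped at the primary (assumed integer)
    volume: str        # hypothetical volume identifier
    data: bytes


@dataclass(frozen=True)
class CheckpointMessage:
    clock_value: int       # sequence clock value carried by the checkpoint
    sequence_number: int   # increases with each checkpoint message


def covered_by(checkpoint: CheckpointMessage, record: UpdateRecord) -> bool:
    """True if the checkpoint signifies that this record has already been
    transmitted, i.e. the record's clock value is less than the checkpoint's."""
    return record.clock_value < checkpoint.clock_value


def next_checkpoint(previous: CheckpointMessage, clock_value: int) -> CheckpointMessage:
    """Build the following checkpoint with an increased sequence number."""
    return CheckpointMessage(clock_value, previous.sequence_number + 1)
```

A record stamped with exactly the checkpoint's clock value is treated here as not yet covered, following the strict "less than" wording above.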

The invention further provides an asynchronous
remote data copy system including a primary site hav-
ing a first plurality of subsystems interconnected by
a first coupling means, and a secondary site remote 
from the primary site having a second plurality of sub- 
systems interconnected by a second coupling means, 
each of the second plurality of subsystems being in- 
dependently coupled to a counterpart one of the first 
plurality of subsystems, 

the first plurality of subsystems including 
means for sending a checkpoint signal to each of the 
first plurality of subsystems; and means for sending 
updated data and the checkpoint signal to each of the 
counterpart coupled second plurality of subsystems; 

and the second plurality of subsystems includ- 
ing means for receiving the updated data and check- 
point signals; and means for coordinating the writing 
of the updated data based upon the checkpoint sig- 
nals. 

Preferably, the checkpoint signal sending means 
comprises: means for sending a checkpoint message 
to the first plurality of subsystems; and means re- 
sponsive to the checkpoint message sending means 
for inserting the checkpoint message into an update 
data sequence from each of said first plurality of sub- 
systems to its counterpart subsystem. 

Thus the above method and means may be used for the remote
copying of data at secondary DASD subsystems. Typically in
such an implementation, the secondary DASD subsystems are
peer coupled to counterpart ones of primary host attached
DASD subsystems. A start copy operation is initiated at the
host by a message broadcast to all primary DASD subsystems.
Each primary builds a configuration table using a local area
network or other suitable means as a set associative device
for table building. This table identifies all the primary
subsystem participants. Also, each primary synchronizes its
local clock or other time reference to that of a designated
primary subsystem, the designated primary being operative as
a clocking and checkpoint message source. The designated
primary periodically sends checkpoint messages, having a
sequence time value and an increasing checkpoint sequence
number, to each primary subsystem. Each primary subsystem
logically inserts these into its local copy write record
sequence. At the secondary host attached DASD subsystems,
one of the secondary subsystems is designated as a local
synch source. Each secondary subsystem builds a
configuration table of copy active subsystems, and couples
to the counterpart primary subsystem. Next, each secondary
subsystem asynchronously receives and locally buffers a copy
sequence from its primary counterpart. Each received
sequence includes checkpoint messages embedded therein at
the primary after all time stamped write updated records
have been sent. Responsive to receipt of a checkpoint
message, the secondary subsystem will remit it to the local
synch source. A rendezvous is executed by this source over
all the checkpoint messages. Thus, each secondary subsystem
writes from buffer to DASD upon receipt of the rendezvous
completion from the synch source. If one or more of the
primary DASD subsystems become unavailable, then their
counterparts at the secondary site resign from the copy
group. Such occurs only after completion of any updates in
progress.
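
The rendezvous performed by the local synch source can be pictured with the following minimal sketch. It assumes that each secondary forwards the checkpoint sequence number it has received and that the rendezvous completes once every copy active secondary has reported the same number; the `SynchSource` class and its method names are hypothetical.

```python
class SynchSource:
    """Hypothetical local synch source at the secondary site."""

    def __init__(self, copy_active):
        # Secondary subsystems currently participating in the copy group.
        self.copy_active = set(copy_active)
        # checkpoint sequence number -> subsystems that have reported it
        self.reported = {}

    def is_complete(self, checkpoint_seq):
        """Rendezvous is complete when all copy active secondaries reported."""
        arrived = self.reported.get(checkpoint_seq, set())
        return bool(self.copy_active) and self.copy_active <= arrived

    def report(self, subsystem_id, checkpoint_seq):
        """Record a checkpoint remitted by a secondary; return completion status."""
        if subsystem_id in self.copy_active:
            self.reported.setdefault(checkpoint_seq, set()).add(subsystem_id)
        return self.is_complete(checkpoint_seq)

    def resign(self, subsystem_id):
        """Remove a secondary whose primary counterpart became unavailable."""
        self.copy_active.discard(subsystem_id)
```

In this sketch a secondary would write its buffered updates to DASD only once `report` (or a later `is_complete` check) returns true for the checkpoint in question.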

This approach allows for distributed non-central-system
operated control of a sequence-consistent real-time
asynchronous data copy from a set of DASD subsystems at a
primary location to a set of DASD subsystems at a secondary
location, with primary subsystems separately and
independently connected to secondary subsystems. This
enables update sequence integrity of a data copy at a
plurality of subsystems remote from a source of
asynchronously independently generated sequence of write
operations, there being a first plurality of subsystems at a
primary site, the first plurality of subsystems being
interconnected by a first coupling means, and a second
plurality of subsystems at a site remote from the primary
site, the second plurality of subsystems being
interconnected by a second coupling means, each of the
second plurality of subsystems being independently coupled
to one of the first plurality of subsystems.

Thus in a system for remote copying of data at a secondary
site having a plurality of direct access storage device
(DASD) subsystems that are interconnected by a first
coupling means, said secondary DASD subsystems being coupled
to counterpart ones of a plurality of DASD subsystems at a
remote primary site, said secondary DASD subsystems being
interconnected by a second coupling means, said system
including means at the primary site for initiating a start
copy operation, a method is provided comprising the steps
of:

(a) at the primary and secondary sites, forming m copyset
groups of DASD subsystems respectively;

(b) at the primary site:

(1) causing each of the m subsystems to asynchronously
generate a sequence of updated write records;

(2) ordering each of said sequences by embedding common
clock values and a periodic checkpoint message with a common
clock value and an increasing sequence number in each of
said sequences,

(3) coupling counterpart DASD subsystems between the sites
and asynchronously communicating sequences from each of the
primary site subsystems into a buffered portion of a
counterpart secondary site subsystem; and

(c) at the secondary site, writing the buffered sequences to
DASD in the counterpart subsystems as a function of each
checkpoint message.

An embodiment of the invention will now be described in
detail by way of example only, with reference to the
following drawings:

Figure 1 is a conventional remote data copy system
configuration;

Figure 2 is a remote dual copy system configured in
accordance with the present invention;

Figure 3 is a flow chart showing the general operation of
the remote dual copying of Figure 2; and

Figures 3A-3D are more detailed flow charts showing the
operation of the remote dual copying system of Figure 2.

To perform asynchronous remote copying, (1) the sequence of
data updates must be determinable at a local site; (2) that
sequence must be communicable by the local site to a remote
site; and (3) the remote site must be able to use the
sequence to control the updating at the remote site.

As mentioned above, prior art asynchronous copy systems
which include multiple subsystems require a central system
for providing the appropriate sequences of data. However,
system configurations may be found in which it is not
desired to pass update data between the primary and
secondary locations through a centralized communication
service. Rather, in those configurations it is desired to
directly connect DASD subsystems at the primary and
secondary sites via independent subsystem-to-subsystem
communication links. Such a system 10 is shown in Figure 1.
The system 10 includes a host 11 which provides data to
primary subsystems 12'. As is shown, each of the links
between a primary DASD subsystem 12' and its peer coupled
secondary DASD subsystem 14' is independent. As a result,
this type of system would inherently be incapable of write
operations that are sequentially consistent.

To more specifically describe the problem, consider as an
example a sequence of three writes from a conventional
database management system (DBMS) as it is about to commit a
transaction. The example is representative of an information
management service (IMS) system:

1. The DBMS writes to its log data set; the record written
contains old data base (DB) data, new DB data (that is being
changed by this transaction), and a record of its intent to
commit (finalize) this transaction.

2. The DBMS waits for a DASD I/O operation to report that it
has completed, then it updates its data base data sets,
which are different volumes that are on a different DASD
subsystem. This writing of the new DB data overwrites and
thus destroys the old DB records.

3. The DBMS waits for the second DASD I/O operation to be
posted complete, then it writes a commit record to its log
data set on the first volume. This commit record
'guarantees' to future IMS recovery processes that the data
base data set (DASD volumes) have been updated.

Now consider operation of an asynchronous remote copy system
using a multiple subsystem configuration such as illustrated
in Figure 1. In this embodiment, the DBMS log data set is on
a volume configured to the topmost DASD subsystem pair (DASD
1 and DASD 1') and the data base volumes are on the DASD
subsystem pair shown second from the top (DASD 2 and DASD
2'). In this example, the primary subsystem 1 is lightly
loaded, thus it processes its queued work with no delay,
while the primary subsystem 2 is heavily loaded and is
experiencing some delay in processing its queued work.
Queued work includes the forwarding of updated data to its
remote copy peer subsystem.

The following sequence would describe the operation in the
present example:

1. Write I/O (1) is completed from application to subsystem
1.

2. Write I/O (2) is completed from application to subsystem
2.

3. Primary Subsystem 1 sends data for I/O (1) to secondary
subsystem 1, which applies it to its cache memory copy of
the DASD volume.

4. Write I/O (3) is completed from application system to
subsystem 1.

5. Primary Subsystem 1 sends data for I/O (3) to secondary
subsystem 1, which applies it to its cache memory copy of
the DASD volume.

6. Primary Subsystem 2 sends data for I/O (2) to secondary
subsystem 2, which applies it to its cache memory copy of
the DASD volume.

If a primary site failure occurs after step 5 and before
step 6 there would be corrupted data copied into the system.
Since the failure rendered primary subsystems inaccessible,
the data from I/O (2) will not be at the secondary site.

The essence of data sequence consistency in a remote copy
service is to ensure at such a time as the remote DASD must
be used for recovery, that the new data from the second I/O
operation above will only be seen if the data from the first
I/O is also present, and that the data from the third I/O
will only be present if the second I/O is also present.
Consider the data integrity loss if the third I/O were
present but the second I/O was not. The DBMS log would tell
recovery processes that the data base would receive valid
data. This would either result in a DBMS failure or business
application error.

The sequence of I/O operations at the primary subsystems is
DASD 1, DASD 2, then back to DASD 1; but because of the load
on DASD subsystem 2, it is delayed in sending its I/O (2)
such that it has not arrived by the time I/O (3) from DASD
subsystem 1 was received by secondary DASD subsystem 1'.
With no control of the transmission of subsystem data to
copy volumes, secondary subsystem 1' would update its copy
volume with both 1 and 3. If at that time a disaster befell
that primary system and operations were directed to resume
at the secondary site (such contingency being the reason for
having a real-time remote DASD copy), the recovering DBMS
would find a log record that said that data base records
were written (in I/O 2) while the data for I/O 2 would not
be on the DASD of secondary subsystem 2.
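
The failure window in this example can be reproduced with a small simulation. This is a sketch under simplifying assumptions: each write is identified only by its I/O number, the independent links deliver in the order of steps 3, 5 and 6, and a copy counts as sequence consistent when the applied writes form a prefix of the order in which the application issued them. The function names are hypothetical.

```python
# Writes in the order the application issued them: (I/O number, primary subsystem).
writes = [(1, "DASD1"), (2, "DASD2"), (3, "DASD1")]

# Order in which the independent links deliver data to the secondary site;
# I/O 2 is delayed because primary subsystem 2 is heavily loaded.
arrival_order = [1, 3, 2]


def is_sequence_consistent(applied, issued):
    """True if the applied writes are exactly the first len(applied) issued writes."""
    issued_numbers = [io for io, _ in issued]
    return set(applied) == set(issued_numbers[:len(applied)])


def secondary_state_after(n_arrivals):
    """I/O numbers applied at the secondary after n arrivals, with no coordination."""
    return set(arrival_order[:n_arrivals])


for n in range(1, 4):
    state = secondary_state_after(n)
    print(sorted(state), is_sequence_consistent(state, writes))
```

After the second arrival the secondary holds I/O 1 and I/O 3 without I/O 2, which is the inconsistent state a primary failure at that moment would leave behind.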

With such independent links, determination of sequence of
writes among independent primary DASD subsystems,
communication of that information to the set of subsystems
at the secondary location, and control of the sequence of
updates among the independent DASD subsystems at the
secondary has been a problem heretofore. However, as
explained in detail below, a system and method are provided
for sequence identification, communication, and update
control that permit sequence-consistent remote DASD data
copy from multiple independent DASD subsystems at one
location to a set of secondary DASD subsystems, each
subsystem being independently connected to peer subsystems
at the first location.

Figure 2 is a remote dual copy system 20 including a host 11
which provides data to primary subsystems 12. The system 20
includes a first group of DASD subsystems 12 which are
located at a primary site and a second group of DASD
subsystems 14 which are located at a site that is remote
from the primary site. The system 20 includes couplers 16
and 18 which provide for interconnections of the DASD
subsystems 12 and 14, respectively.

Referring now to Figure 2, system 20 for achieving update
sequence integrity without a centralized communication
system is based on the presence of the following
configurations.

1. Multiple primary location DASD subsystems each
interconnected via one or more communication links to a peer
DASD subsystem at the secondary location.

2. A coupling connection 16 of all subsystems at the primary
site, and a similar coupling connection 18 of all subsystems
at the secondary site. While not shown, one of ordinary
skill in the art will readily recognize multiple physical
connections may be incorporated for connection redundancy.
The term "coupling" is used for convenience; any suitable
physical network interconnection means may be used such as a
local area network (LAN) or the like.

3. Each subsystem has a 'clock' or similar synchronizing
signal process that can be synchronized with a value from
another subsystem, communicated via the coupling 16.

Three steps are utilized to achieve copy update sequence
integrity: (1) determination of sequence of write operations
among all DASD subsystems at the primary; (2) communication
of that sequence information to the secondary; and (3) use
of that information by the secondary DASD subsystems to
control the sequence of update writes across all secondary
DASD. These three steps are described in more detail
hereinbelow for a copy system using LAN interconnections at
the secondary and primary subsystems.

1. Use of the LAN interconnection among the primary DASDs to
distribute a sequencing signal, such as a time clock, to all
subsystems, the signal being used to associate a
sequence/time value with each DASD write of data to be
copied to a secondary DASD subsystem (and the sending of
that value along with update data to the secondary);

2. Propagation of periodic synchronizing-time-denominated
checkpoint signals among the primary DASD subsystems that in
turn are communicated by each primary subsystem to its
peer-connected secondary subsystem(s); and

3. Use of the LAN interconnection among secondary DASD
subsystems to coordinate their DASD update writing of copy
data received from primary subsystems.

Referring now to Figure 3, what is shown is a flow chart of
the general operation 100 of such a system that is located
within the primary subsystem. First, a start copy operation
is sent to all the primary subsystems to activate
communication between primary subsystems, via step 102. Then
individual subsystems provide the appropriate sequence
information to all the copy active primary subsystems, via
step 104. Thereafter the primary subsystems communicate the
sequence information to the peer coupled secondary
subsystems, via step 106. Finally, that sequence information
is utilized to control secondary subsystem updates via step
108.

These steps are described in detail below: 

Start copy operation (step 102) 

Remote copy operations for each DASD subsystem must be
started by some instruction to that subsystem. The
instruction may come via command from a host system (the
same system as may be executing applications that write to
DASD), or it may be provided via subsystem-local interface
such as from a subsystem operator console or similar means.

Description of DASD write sequence at the primary 
(step 104) 

Refer now to Figure 3A and the following discussion.
Irrespective of whether the command comes from the host
system or from subsystem local interface, the start copy
instruction identifies the DASD to be copied and causes the
subsystem to activate communications with secondary
subsystem(s) to which it is connected via step 202.

The start copy operation also activates communication from
that DASD subsystem to other primary DASD subsystems. The
LAN connection addresses of other primary DASD subsystems
may be configured to each subsystem or it may be
incorporated in information contained in the start copy
instruction to the subsystem. When copy is started at a
subsystem, each subsystem sends a subsystem copy active
message to all other primary subsystems it has addresses for
via step 204. All primary subsystems build and maintain
configuration tables such that each subsystem knows the
identity of all primary subsystems participating in the copy
process via step 206. Each subsystem receiving a copy active
message from another subsystem marks that subsystem as copy
active in its configuration tables.
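
A minimal sketch of steps 204 and 206 follows. It assumes an in-process dictionary stands in for the LAN (address to subsystem object) and that a configuration table is simply a mapping from subsystem identity to a copy active flag; `PrimarySubsystem` and its methods are hypothetical names.

```python
class PrimarySubsystem:
    """Hypothetical primary subsystem taking part in configuration table building."""

    def __init__(self, subsystem_id, known_addresses):
        self.subsystem_id = subsystem_id
        # Addresses configured locally or supplied with the start copy instruction.
        self.known_addresses = list(known_addresses)
        # Configuration table: subsystem id -> True once marked copy active.
        self.config_table = {subsystem_id: True}

    def start_copy(self, network):
        """Send a copy active message to every primary this subsystem can address."""
        for address in self.known_addresses:
            network[address].on_copy_active(self.subsystem_id)

    def on_copy_active(self, sender_id):
        """Mark the sending subsystem as copy active in the configuration table."""
        self.config_table[sender_id] = True

    def participants(self):
        """Identities of all primaries known to be participating in the copy."""
        return sorted(sid for sid, active in self.config_table.items() if active)


ids = ["p1", "p2", "p3"]
network = {i: PrimarySubsystem(i, [j for j in ids if j != i]) for i in ids}
for subsystem in network.values():
    subsystem.start_copy(network)
```

Once every subsystem has started copy, each configuration table lists the same set of participants.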

As a part of exchanging copy active messages with the other
primary systems, the subsystems synchronize their sequence
clock processes and select one subsystem to be a master
source for a timing value synchronization signal via step
208. This is a clock synchronization process, not described
here since such processes are well known in the art. Note
that clock synchronization must be able to bound clock drift
such that maximum drift is substantially less than the time
for a subsystem to receive a write command from a system,
perform a write to cache, signal end of I/O operation, and
for the host to process the I/O completion status and start
a new DASD I/O write operation.
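
Written as an inequality, the drift requirement reads as follows. The symbols are introduced here purely for illustration and are not used elsewhere in the specification: $\delta_{\max}$ is the maximum clock drift between any two primary subsystems, and the four terms on the right are the times to receive the write command, write to cache, signal end of I/O, and for the host to process completion and start the next write.

```latex
\delta_{\max} \ll t_{\text{receive}} + t_{\text{cache}} + t_{\text{end}} + t_{\text{host}}
```

Under this bound, two writes that the host issues one after the other (the second only after the first has completed) cannot receive sequence time values in the wrong order, even when they go to different subsystems.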

As write operations are performed by each subsystem, in a
preferred embodiment, the subsystem includes the then
current time sequence signal value with other control
information that is sent with the DASD data to its connected
secondary subsystem. The primary system DASD write I/O
operation continues without delay, extended only by the time
necessary to construct control information for the data to
be sent to the secondary. DASD and control data are buffered
in the secondary subsystem on receipt. Update of secondary
DASD copy is deferred until released by secondary sequence
control (described below).
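
The write path just described might be sketched as follows. This is an illustration under stated assumptions: the sequence signal is a callable returning an integer, the link to the secondary is an object with a `receive` method, and `PrimaryWriter` and `SecondaryBuffer` are hypothetical names.

```python
from collections import deque


class SecondaryBuffer:
    """Hypothetical secondary subsystem: buffers on receipt, defers the DASD update."""

    def __init__(self):
        self.buffered = []   # held until released by secondary sequence control
        self.applied = []    # updates actually written to the DASD copy

    def receive(self, control, data):
        self.buffered.append((control, data))


class PrimaryWriter:
    """Hypothetical primary subsystem write path."""

    def __init__(self, clock, link):
        self.clock = clock       # callable returning the current sequence signal value
        self.link = link         # connected secondary subsystem
        self.cache = {}
        self.outbound = deque()  # copies queued for asynchronous transmission

    def write(self, address, data):
        """Complete the host write locally; only queue the copy for the secondary."""
        self.cache[address] = data
        control = {"address": address, "sequence_time": self.clock()}
        self.outbound.append((control, data))
        return "complete"

    def drain(self):
        """Forward queued updates, in order, to the connected secondary."""
        while self.outbound:
            control, data = self.outbound.popleft()
            self.link.receive(control, data)
```

The host write returns before anything is sent, which mirrors the point that the primary I/O is extended only by the time needed to build the control information.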

Communication of sequence information to 
secondary DASD subsystems (step 106) 

Refer now to Figure 3B and the following discussion. The
subsystem that is providing the time synchronizing signal source
will periodically send a checkpoint message to all primary
copy-active subsystems via step 302. This may be included with
the time sync signal or sent separately as appropriate for the
local interconnection protocol used. The checkpoint message
includes a sequence time value and a checkpoint sequence number
that is incremented by one for each checkpoint message. Each
subsystem on receiving the checkpoint communication will
logically insert it in its transmission stream to the secondary
subsystem(s) to which it is connected, sending it to secondary
subsystem(s) only after all DASD and control information with an
earlier sequence time signal value have been sent via step 304.
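
The logical insertion of a checkpoint into the transmission stream
can be sketched as follows. The type and function names are
hypothetical. The sketch places updates whose time equals the
checkpoint time ahead of the checkpoint, which is an assumption
based on the later statement that a checkpoint covers values "equal
to or less than" its time sync value.

```python
from typing import List, NamedTuple, Union


class Update(NamedTuple):
    seq_time: int
    payload: str


class Checkpoint(NamedTuple):
    seq_time: int
    number: int   # incremented by one for each checkpoint message


def insert_checkpoint(outbound: List[Update],
                      checkpoint: Checkpoint) -> List[Union[Update, Checkpoint]]:
    """Return the transmission stream with the checkpoint placed after
    every queued update that is not later than the checkpoint time."""
    covered = [u for u in outbound if u.seq_time <= checkpoint.seq_time]
    later = [u for u in outbound if u.seq_time > checkpoint.seq_time]
    return covered + [checkpoint] + later
```

For example, with queued updates at times 1, 5 and 2 and a
checkpoint at time 3, the updates at times 1 and 2 are sent first,
then the checkpoint, then the update at time 5.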

Use of sequence information by secondary DASD
subsystems to control secondary DASD updates
(step 108)

Refer now to Figures 3C and 3D and the following discussion.
Copy operations in secondary systems are activated by copy
active messages from their connected primary subsystems via step
402. (An enabling startup instruction from a system or local
console at the secondary may also be required. That aspect of
copy operation and control is performed in a conventional
manner.) Secondary subsystems build and maintain copy active
tables such that each subsystem has data relating to the
identity of all secondary subsystems participating in the copy
process, via step 404. A synchronizing control master is
selected from among the secondary subsystems, using the local
interconnect (e.g. a LAN) similar to the manner in which a
primary synchronizing signal source was selected, via step 406.
Such distributed control schemes are well known and need not be
described here.

Secondary subsystems receive and buffer DASD data and
associated control information from their connected primary
subsystems. This received data and control information is
logically grouped and sequenced by the primary synchronizing
signal value. For maximum protection, received data and control
information buffering should be in a non-volatile storage
medium.

At some point, each subsystem will receive a checkpoint control
message via step 408 that signifies that all DASD update data
and control with a primary synchronizing signal value equal to
or less than the checkpoint time sync value has been sent to
that secondary subsystem. A checkpoint message, when received,
is sent by the receiving subsystem to the secondary master
subsystem, which performs a rendezvous for that checkpoint
message from all copy-active secondary subsystems, including
itself, via step 410.
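
The rendezvous performed by the secondary master can be sketched
as a per-checkpoint tally of which copy-active subsystems have
reported. The class and method names are assumptions for
illustration; the specification does not prescribe a data
structure.

```python
from typing import Dict, Iterable, Set


class RendezvousMaster:
    """Tracks checkpoint arrivals from all copy-active secondaries."""

    def __init__(self, members: Iterable[int]) -> None:
        # Members include the master itself.
        self.members = frozenset(members)
        self.arrived: Dict[int, Set[int]] = {}

    def report(self, checkpoint_number: int, subsystem_id: int) -> bool:
        """Record one arrival; return True once every member has
        forwarded this checkpoint, meaning the rendezvous is complete."""
        if subsystem_id not in self.members:
            raise ValueError("subsystem is not copy active")
        seen = self.arrived.setdefault(checkpoint_number, set())
        seen.add(subsystem_id)
        if seen == self.members:
            del self.arrived[checkpoint_number]
            return True
        return False
```

A True result corresponds to the point at which the master may
tell the secondaries to release buffered updates for that
checkpoint.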

When the rendezvous is complete via step 412 for a given
checkpoint value, the secondary master subsystem sends a message
to all secondary copy-active subsystems to release update data
up to the primary sequence time value of the checkpoint. Each
subsystem then marks itself in an internal update in progress
state and applies the updates to its secondary DASD copy via
step 414. (Note: The actual process of applying the buffered
DASD copy data may require only adjusting cache directory
entries.)

The update in progress state must be maintained through
subsystem resets and power off along with buffered data and
control info from the primary (and other copy service state
information). It is used as a must-complete type operation
control that precludes any host system access to secondary DASD
when that state is active. This ensures that an application
takeover at the secondary cannot see partially updated DASD
data. That is, it cannot access a sequence-inconsistent DASD
copy. Update for the checkpoint interval must be complete before
a user can access the copy DASD subsystems records.

When the updating of DASD copy data has completed, the
subsystem sends an update complete for checkpoint message
(identifying the specific checkpoint) for that checkpoint to the
secondary master, via step 416. When update complete signal for
checkpoint messages have been received at the master from all
subsystems including the master, the master then sends a reset
update in progress state message to all secondary subsystems to
allow secondary copy data to again be accessible to attached
systems, via step 418.
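
The release, update complete and reset exchange can be sketched
from the point of view of one secondary subsystem. All names here
are hypothetical, the DASD copy is modelled as a dictionary, and
the flag is held in memory only, whereas the text requires the
state to survive resets and power off.

```python
from typing import Dict, List, Optional, Tuple


class SecondarySubsystem:
    def __init__(self) -> None:
        # Buffered updates as (seq_time, record_key, value).
        self.buffered: List[Tuple[int, str, str]] = []
        self.dasd: Dict[str, str] = {}
        # Would be kept in non-volatile storage in a real subsystem.
        self.update_in_progress = False

    def release(self, checkpoint_time: int) -> str:
        """Apply buffered updates up to the checkpoint time."""
        self.update_in_progress = True
        remaining = []
        for seq_time, key, value in sorted(self.buffered):
            if seq_time <= checkpoint_time:
                self.dasd[key] = value
            else:
                remaining.append((seq_time, key, value))
        self.buffered = remaining
        return "update complete"   # sent to the secondary master

    def reset_update_in_progress(self) -> None:
        # Issued by the master once all subsystems have completed.
        self.update_in_progress = False

    def host_read(self, key: str) -> Optional[str]:
        if self.update_in_progress:
            raise PermissionError("update in progress")
        return self.dasd.get(key)
```

Host reads stay blocked after a subsystem finishes its own
updates, because only the master's reset message shows that every
other subsystem has finished as well.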

In a variation to the preferred embodiment, if the coupling
means among all primary subsystems can reliably propagate every
checkpoint message to all subsystems in substantially less time
than the time for an I/O operation cycle then the preceding
processes could function without a clock and clock sync. The
arrival time of a checkpoint message at each primary subsystem
would be precise enough to define update sequence. All updates
within a checkpoint would be considered to have occurred at the
same time.
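
One way to picture this variation, as a sketch only: each primary
subsystem tags an update with the number of the most recent
checkpoint it has received instead of a clock value. The class and
method names are illustrative and not taken from the
specification.

```python
from typing import Tuple


class IntervalTagger:
    """Orders updates by checkpoint interval rather than by clock."""

    def __init__(self) -> None:
        self.interval = 0   # number of the last checkpoint received

    def on_checkpoint(self, number: int) -> None:
        self.interval = number

    def tag(self, payload: str) -> Tuple[int, str]:
        # Updates sharing an interval are treated as simultaneous.
        return (self.interval, payload)
```

Two updates tagged between the same pair of checkpoints carry the
same interval number, so no ordering is implied between them.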

The steps of subsystem operation for the three I/Os described
previously, using the approach described above, are discussed
below.

1. DASD 1, 2, 3 and 4 exchange 'copy active' messages. Primary
subsystem 3 has become the master source for timing value sync
signal. Secondary system 4 has become the secondary master for
rendezvous of checkpoint messages.

2. Write I/O (1) is completed from application to
subsystem 1 at time 'a'.

3. Write I/O (2) is completed from application to
subsystem 2 at time 'b'.

4. Primary Subsystem 1 sends data for I/O (1) to secondary
subsystem 1 along with its associated time value 'a'. Secondary
subsystem 1 buffers the data but does not apply it to its cache
copy for the DASD volume.

5. Subsystem 3 sends a checkpoint message containing checkpoint
sequence number 'n' and time value 'b'.

6. Write I/O (3) is completed from application system to
subsystem 1 at time 'c'.

7. Primary Subsystem 1 sends data for I/O (3) to secondary
subsystem 1 along with its associated time value 'c'. Secondary
subsystem 1 buffers the data but does not apply it to its cache
copy for the DASD volume.

8. Primary subsystem 1 receives and processes 
the checkpoint message sent by subsystem 3 in 
step 5. 

9. Primary subsystem 1 sends checkpoint message to secondary
subsystem 1, which forwards it to secondary subsystem 4.

10. Primary Subsystem 2 sends data for I/O (2) 
to secondary subsystem 2 along with its associ- 
ated time value 'b'. Secondary subsystem 2 buf- 
fers the data but does not apply it to its cache 
copy for the DASD volume. 

11. Primary subsystem 2 receives and processes
the checkpoint message sent by subsystem 3 in
step 5.

12. Primary subsystem 2 sends checkpoint mes- 
sage to secondary subsystem 2, which forwards 
it on to secondary subsystem 4. 

13. At some point between steps 5 and 13, subsystems 3 and 4
have sent the checkpoint message 'n' to their secondary
subsystems. Secondary subsystem 3 has forwarded it to secondary
subsystem 4.

14. Secondary subsystem 4 sends a 'release' message to
secondary subsystems 1, 2 and 3.

15. Secondary subsystem 3, having no update, immediately
returns an 'update complete' message to secondary subsystem 4.

16. Secondary subsystem 1 enters 'update in progress' state,
then applies update (1). It does not apply update (3) since its
sync time value 'c' is greater than checkpoint time value 'b'.
It then sends an update complete message to secondary
subsystem 4.

17. Secondary subsystem 2 enters update state and applies
update (2), and sends an update complete message to secondary
subsystem 4.

18. Secondary subsystem 4, having received update complete
messages from all secondary subsystems, sends a 'reset update
in progress state' message to secondary subsystems 1, 2 and 3.

Now if a primary site failure happens at any point 
in the above sequence the secondary DASD will eith- 
er show none of the updates, or will show updates (1) 
and (2). Update (3) will not be 'applied' to secondary 
subsystem 1's DASD and cache until the next check- 
point sequence. 
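
The outcome of the walkthrough can be reproduced with a small
simulation of the secondary buffers at the moment of release. The
numeric times standing in for 'a', 'b' and 'c' (with a < b < c)
and the function name are assumptions for illustration.

```python
from typing import Dict, List, Tuple

A, B, C = 1, 2, 3   # stand-ins for times 'a' < 'b' < 'c'

# Buffered updates per secondary subsystem as (time, label).
buffers: Dict[int, List[Tuple[int, str]]] = {
    1: [(A, "I/O (1)"), (C, "I/O (3)")],
    2: [(B, "I/O (2)")],
    3: [],
    4: [],
}


def release(buffers: Dict[int, List[Tuple[int, str]]],
            checkpoint_time: int) -> List[Tuple[int, str]]:
    """Apply every buffered update not later than the checkpoint
    time and leave the rest buffered for a later checkpoint."""
    applied = []
    for sid in sorted(buffers):
        ready = [u for u in buffers[sid] if u[0] <= checkpoint_time]
        buffers[sid] = [u for u in buffers[sid] if u[0] > checkpoint_time]
        applied.extend(ready)
    return sorted(applied)


applied = release(buffers, checkpoint_time=B)
print(applied)      # updates (1) and (2)
print(buffers[1])   # update (3) is still buffered
```

Updates (1) and (2) are applied together under checkpoint 'n',
while update (3) stays in the buffer of secondary subsystem 1
until a checkpoint with a time value of at least 'c' is released.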

Accordingly, through the present system a sequence consistent
real-time asynchronous copy system is provided that does not
require the use of a central communications service. In so
doing, a system is provided that requires minimal modification,
while utilizing existing capabilities within the DASD
subsystems to provide for sequence modifications.



Claims 


1. A method of operating an asynchronous remote data copy
system (20) including a primary site having a first plurality of
subsystems (12) interconnected by a first coupling means (16),
and a secondary site remote from the primary site having a
second plurality of subsystems (14) interconnected by a second
coupling means, each of the second plurality of subsystems being
independently coupled to a counterpart one of the first
plurality of subsystems, said method comprising the steps of:

in the first plurality of subsystems:

sending a checkpoint signal to each of the
first plurality of subsystems;

sending updated data and the checkpoint signal to each of the
counterpart coupled second plurality of subsystems; and

in the second plurality of subsystems:

receiving the updated data and checkpoint signals; and

coordinating the writing of the updated
data based upon the checkpoint signals.

2. The method of claim 1, further comprising the step of
distributing a sequence signal at the primary site, said
checkpoint signal having a predetermined relationship with the
sequence signal.

3. The method of claim 1 or 2 further comprising the steps of:

activating each of the first plurality of subsystems to
communicate with the other subsystems in said first plurality of
subsystems;

activating each of the first plurality of subsystems to
communicate with its counterpart coupled subsystem in the second
plurality of subsystems;

building at least one configuration table in each of the first
plurality of subsystems such that each of the first plurality of
subsystems can identify all of the other subsystems in said
first plurality of subsystems; and

synchronizing the first plurality of subsystems.


4. The method of any preceding claim in which the
coordinating step further comprises the steps of:

receiving copy active messages in the
second plurality of subsystems;

building copy active tables in the second
plurality of subsystems;

synchronizing copy operations of the second
plurality of subsystems;

receiving the checkpoint messages in the
second plurality of subsystems;

performing a rendezvous for all checkpoint messages in the
second plurality of subsystems; and

applying the updated data at the second
plurality of subsystems.

5. The method of any preceding claim, further comprising the
steps of:

causing each of said first plurality of subsystems to
asynchronously generate a sequence of updates;

ordering each of said sequences in accordance with the received
sequence signal and checkpoint signal;

asynchronously communicating sequences of updates from each of
the first plurality of subsystems into a buffered portion of a
counterpart subsystem; and

applying the buffered sequences at the
second plurality of subsystems as a function of
each checkpoint message.

6. A method of operating an asynchronous remote data copy
system (20) including a primary site having a first plurality of
subsystems (12) interconnected by a first coupling means (16),
and a secondary site remote from the primary site having a
second plurality of subsystems (14) interconnected by a second
coupling means, each of the second plurality of subsystems being
independently coupled to a counterpart one of the first
plurality of subsystems, said method comprising the steps of:

(a) at the primary site responsive to initiation of a start
copy operation, ascertaining at each subsystem the subset of the
plurality of DASD subsystems forming the copyset group,
designating one of the plurality of subsystems as a clocking and
checkpoint message source, each checkpoint message including a
sequence clock value and an increased sequence number;

(b) at the secondary site, repeating step (a) for 
counterpart ones of the DASD subsystems; 

(c) at the primary site, periodically generating clocking
signals and checkpoint messages by said designated subsystem and
broadcasting said signals and messages as they occur to other
subsystems forming the copyset group including itself, at each
subsystem in the copyset group, and

(1) asynchronously forming a local sequence of updated records,

(2) embedding said signals and messages into the sequence to
form a time discriminated total ordering of updated records,
and

(3) remitting at least a portion of the sequence to a buffer
portion of the counterpart DASD subsystem at the secondary
site; and

(d) at the secondary site, applying a checkpoint message to the
designated subsystem operative as a synchronizing source by each
subsystem in the copyset group and writing the sequences or
portions thereof stored in the buffered portions of said
subsystems to DASD only upon a signal from said designated
subsystem indicative of its receipt of all send messages.

7. The method according to claim 6, wherein each checkpoint
message signifies that all DASD records with a clock signal with
a sequence number that is less than the checkpoint message
sequence clock value have been transmitted to the counterpart
secondary subsystem.

8. An asynchronous remote data copy system (20) including a
primary site having a first plurality of subsystems (12)
interconnected by a first coupling means (18), and a secondary
site remote from the primary site having a second plurality of
subsystems (14) interconnected by a second coupling means, each
of the second plurality of subsystems being independently
coupled to a counterpart one of the first plurality of
subsystems,

the first plurality of subsystems including means for sending a
checkpoint signal to each of the first plurality of subsystems;
and means for sending updated data and the checkpoint signal to
each of the counterpart coupled second plurality of subsystems;

and the second plurality of subsystems including means for
receiving the updated data and checkpoint signals; and means for
coordinating the writing of the updated data based upon the
checkpoint signals.

9. The system of claim 8, further comprising the means for
distributing a sequence signal at the primary site, said
checkpoint signal having a predetermined relationship with the
sequence signal.

10. The system of claim 9 or 10 in which the checkpoint
signal sending means comprises:

means for sending a checkpoint message
to the first plurality of subsystems; and

means responsive to the checkpoint message sending means for
inserting the checkpoint message into an update data sequence
from each of said first plurality of subsystems to its